Crochemore's String Matching Algorithm: Simplification, Extensions, Applications

Authors

  • Juha Kärkkäinen
  • Dominik Kempa
  • Simon J. Puglisi
Abstract

We address the problem of string matching in the special case where the pattern is very long. First, constant extra space algorithms are desirable with long patterns, and we describe a simplified version of Crochemore’s algorithm retaining its linear time complexity and constant extra space usage. Second, long patterns are unlikely to occur in the text at all. Thus we define a generalization of string matching called Longest Prefix Matching that asks for the occurrences of the longest prefix of the pattern occurring in the text at least once, and modify the simplified Crochemore’s algorithm to solve this problem. Finally, we define and solve the problem of Sparse Longest Prefix Matching that is useful when the pattern has to be split into multiple pieces because it is too long to be processed in one piece. These problems are motivated by and have application in Lempel-Ziv (LZ77) factorization.
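The Longest Prefix Matching problem defined above can be illustrated with a small brute-force sketch. This is not the simplified Crochemore's algorithm from the paper: it is neither linear time nor constant extra space, and it only pins down what the problem asks for. The function name `longest_prefix_matching`, the 0-based positions, and the convention of returning an empty position list when no non-empty prefix occurs are assumptions made for this illustration.

```python
from typing import List, Tuple


def longest_prefix_matching(pattern: str, text: str) -> Tuple[int, List[int]]:
    """Return (k, positions): k is the length of the longest prefix of
    `pattern` that occurs in `text`, and positions lists every 0-based
    start of that prefix in `text`.

    Naive reference sketch only; it relies on Python's built-in substring
    search rather than on the algorithm described in the abstract.
    """
    # If a prefix of length k occurs in the text, every shorter prefix
    # occurs too, so the set of matching lengths is downward closed and
    # binary search over the length is valid.
    lo, hi = 0, len(pattern)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if pattern[:mid] in text:
            lo = mid
        else:
            hi = mid - 1

    # Assumed convention: report no occurrences for the empty prefix.
    if lo == 0:
        return 0, []

    # Collect all (possibly overlapping) occurrences of the longest prefix.
    prefix = pattern[:lo]
    positions = []
    start = text.find(prefix)
    while start != -1:
        positions.append(start)
        start = text.find(prefix, start + 1)
    return lo, positions
```

For example, with pattern `abcx` and text `zabcabd`, the full pattern does not occur, but its prefix `abc` does, so the call reports length 3 and the single starting position 1.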


Similar resources

Fast Multiple String Matching Using Streaming SIMD Extensions Technology

Searching for all occurrences of a given set of patterns in a text is a fundamental problem in computer science with applications in many fields, like computational biology and intrusion detection systems. In the last two decades a general trend has appeared trying to exploit the power of the word RAM model to speed up the performance of classical string matching algorithms. This study introdu...


Towards a Very Fast Multiple String Matching Algorithm for Short Patterns

Multiple exact string matching is one of the fundamental problems in computer science and finds applications in many other fields, including computational biology and intrusion detection. It turns out that short patterns appear in many instances of such problems and, in most cases, appreciably affect the performance of the algorithms. Recent solutions in the field of string matching try to expl...


Fast Packed String Matching for Short Patterns

Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, like natural language processing, information retrieval and computational biology. In the last two decades a general trend has appeared trying to exploit the power of the word RAM model to speed up the performance of classical string matching algorithms. In ...


String Matching Application for Network Security

String matching is a key component of network security, and biological applications and many other areas benefit from faster string matching algorithms. The effectiveness and efficiency of string matching algorithms are important for applications such as network intrusion detection systems, virus detection, medical science and web content filtering systems. This paper reviews what work has been done i...


A Quick String Matching Employing Mixing Up

Most current string matching algorithms slow down as the number of patterns increases. In this paper a fast matching algorithm named SSEMatch was designed. The PHADDW instruction from the SSE (Streaming SIMD Extensions) set was used in SSEMatch to produce data confusion, by which the patterns can be distributed into pseudo hash addresses such that there will be fewer patterns left for verificat...




Publication date: 2013